A Boosting Algorithm for Label Covering in Multilabel Problems
Abstract
We describe, analyze, and experiment with a boosting algorithm for multilabel categorization problems. Our algorithm includes previously studied boosting algorithms, such as AdaBoost.MH, as special cases. We cast the multilabel problem as multiple binary decision problems, based on a user-defined covering of the set of labels. We prove a lower bound on the progress made by our algorithm on each boosting iteration and demonstrate the merits of our algorithm in experiments with text categorization problems.
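To make the idea of a label covering concrete, here is a minimal Python sketch of one plausible reading of the reduction. It is not the authors' algorithm. It assumes that each subset in the cover defines one binary decision per example, and that the decision counts as an error when any label in the subset is mispredicted. The function name `cover_loss` and the +1/-1 label encoding are illustrative choices, not taken from the paper. Under this assumption, a cover made of singletons yields the Hamming loss, which is the loss AdaBoost.MH targets, while a cover consisting of the full label set yields the exact-match 0/1 loss.

```python
from typing import Sequence, Set


def cover_loss(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
    cover: Sequence[Set[int]],
) -> float:
    """Fraction of (example, subset) pairs with at least one wrong label.

    Assumption (not from the source): a subset of the cover is counted as
    an error whenever any label index inside it is mispredicted.
    """
    errors = 0
    for truth, pred in zip(y_true, y_pred):
        for subset in cover:
            # One binary decision per subset: are all of its labels correct?
            if any(truth[j] != pred[j] for j in subset):
                errors += 1
    return errors / (len(y_true) * len(cover))


# Two examples, three labels, encoded as +1 (relevant) / -1 (not relevant).
y_true = [[1, -1, 1], [-1, -1, 1]]
y_pred = [[1, 1, 1], [-1, -1, 1]]  # one wrong label in the first example

singleton_cover = [{0}, {1}, {2}]  # one subset per label: Hamming loss
full_cover = [{0, 1, 2}]           # one subset with all labels: 0/1 loss

hamming = cover_loss(y_true, y_pred, singleton_cover)
exact_match = cover_loss(y_true, y_pred, full_cover)
```

Intermediate covers, such as overlapping pairs of labels, would sit between these two extremes; which cover is appropriate is left to the user, as the abstract states.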
Similar resources
Online Boosting Algorithms for Multi-label Ranking
We consider the multi-label ranking approach to multilabel learning. Boosting is a natural method for multilabel ranking as it aggregates weak predictions through majority votes, which can be directly used as scores to produce a ranking of the labels. We design online boosting algorithms with provable loss bounds for multi-label ranking. We show that our first algorithm is optimal in terms of t...
Incorporating Prior Knowledge into Boosting for Multi-Label Classification XiaoWang
Multi-label learning deals with the problem where each instance may belong to multiple labels simultaneously. The task of the learning paradigm is to output, for each unseen instance, a label set whose size is unknown a priori, by analyzing a training data set with known label sets. Almost all existing multi-label learning algorithms are based on purely data-driven methods. The larger the...
Log-Linear Models for Label Ranking
Label ranking is the task of inferring a total order over a predefined set of labels for each given instance. We present a general framework for batch learning of label ranking functions from supervised data. We assume that each instance in the training data is associated with a list of preferences over the label-set, however we do not assume that this list is either complete or consistent. Thi...
Fast Label Embeddings via Randomized Linear Algebra
Many modern multiclass and multilabel problems are characterized by increasingly large output spaces. For these problems, label embeddings have been shown to be a useful primitive that can improve computational and statistical efficiency. In this work we utilize a correspondence between rank constrained estimation and low dimensional label embeddings that uncovers a fast label embedding algorit...
ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback
ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use multi-label setting, the learning is implemented as binary classification that estimates the pairwise-relevance of (theorem, premise) pairs. ATPboost uses for this the XGBoost gradient boosti...